feat(shader-driver): wire content-plane Hamming cascade — dispatch sees real similarity #259

Merged
AdaWorldAPI merged 3 commits into main from claude/hamming-content-cascade
Apr 24, 2026
@AdaWorldAPI

Owner

Summary

  • Adds a content-plane Hamming pre-pass to ShaderDriver::run() before the existing palette cascade. For every pair in passed_rows, popcount the XOR of their content fingerprints; if resonance = 1 - Hamming/16384 >= style.resonance_threshold, emit a ShaderHit{predicates: 0x01} (content-match flag).
  • Result: dispatch() now surfaces real text similarity. Previously the PaletteSemiring cascade probed a synthetic Base17 distance table unrelated to the encoded text, and the content plane was read only for the cycle_fp XOR fold, never compared. The audit in EPIPHANIES 2026-04-24 "Ground truth: ShaderDriver dispatch wiring audit" called out Option A (content Hamming pre-pass) as the fix — this PR wires it.
  • Palette cascade kept intact (complementary edge-palette signal). Guard: skip the N^2 sweep when passed_rows.len() > 256.
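The pre-pass described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `Hit` struct, the `content_prepass` function, and the layout of a fingerprint as 256 `u64` words are assumptions made for the example. Only the formula `resonance = 1 - Hamming/16384`, the `0x01` predicate flag, and the 256-row guard come from the description. Emitting one hit per row of each matching pair is inferred from the sample output (3 pairs, 6 hits).

```rust
/// Fingerprint width from the PR: 16384 bits, stored here as 256 u64 words
/// (the word layout is an assumption of this sketch).
const FP_BITS: u32 = 16_384;
const FP_WORDS: usize = (FP_BITS / 64) as usize;
/// Guard from the PR: skip the pairwise sweep above this many rows.
const MAX_PREPASS_ROWS: usize = 256;
/// Content-match predicate flag from the PR.
const PRED_CONTENT_MATCH: u8 = 0x01;

/// Hypothetical stand-in for the driver's ShaderHit.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub row: u32,
    pub distance: u32,
    pub predicates: u8,
    pub resonance: f32,
}

/// Popcount of the XOR of two fingerprints.
pub fn hamming(a: &[u64; FP_WORDS], b: &[u64; FP_WORDS]) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// resonance = 1 - Hamming / 16384
pub fn resonance(distance: u32) -> f32 {
    1.0 - distance as f32 / FP_BITS as f32
}

/// Pairwise content pre-pass over the rows that survived the prefilter.
/// Pushes one hit per row of each pair whose resonance meets the threshold.
pub fn content_prepass(
    fingerprints: &[[u64; FP_WORDS]],
    passed_rows: &[u32],
    threshold: f32,
) -> Vec<Hit> {
    let mut hits = Vec::new();
    if passed_rows.len() > MAX_PREPASS_ROWS {
        return hits; // guard: avoid the N^2 sweep on large row sets
    }
    for (i, &row_i) in passed_rows.iter().enumerate() {
        let fp_i = &fingerprints[row_i as usize];
        for &row_j in &passed_rows[i + 1..] {
            let distance = hamming(fp_i, &fingerprints[row_j as usize]);
            let res = resonance(distance);
            if res >= threshold {
                for row in [row_i, row_j] {
                    hits.push(Hit {
                        row,
                        distance,
                        predicates: PRED_CONTENT_MATCH,
                        resonance: res,
                    });
                }
            }
        }
    }
    hits
}
```

Calling `content_prepass(&fps, &[0, 1, 2], 0.85)` on three fingerprints where only rows 0 and 1 are close would return two hits, one for each row of that pair.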

The problem, before

encode "Palantir develops surveillance systems"     -> row 0 (247 bits)
encode "Palantir Gotham is a surveillance platform" -> row 1 (249 bits)
encode "Israel deploys military AI"                 -> row 2 (259 bits)
dispatch rows 0..3, style=analytical
  -> hit_count: 0, top_k: [], confidence: 0.0, admit_ignorance: true

Zero hits across every style. The Jirak signal (3σ at Hamming < 454) was present in the data, but invisible to dispatch().

The fix, after

dispatch rows 0..3, style=creative (threshold 0.35)
  hit_count: 6
  top_k:
    row 0, distance 6592, predicates 0x01, resonance 0.598   <- Palantir / Palantir pair
    row 1, distance 6592, predicates 0x01, resonance 0.598
    row 1, distance 7232, predicates 0x01, resonance 0.559
    row 2, distance 7232, predicates 0x01, resonance 0.559
    row 0, distance 8256, predicates 0x01, resonance 0.496
    row 2, distance 8256, predicates 0x01, resonance 0.496
  confidence 0.598, admit_ignorance: false

The strongest pair (rows 0↔1, both Palantir) correctly ranks first, and rows 0↔2 (Palantir vs. Israel AI) land lowest. Analytical's 0.85 threshold rejects all three pairs, since none reaches 0.85. Peripheral (0.20) matches creative here: the same pairs survive because even the weakest pair sits at 0.496 > 0.20.
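The numbers in the output above can be checked by hand with a few lines. The helper names `resonance` and `admitted_pairs` are hypothetical and exist only for this check; the distances and thresholds are the ones reported in this PR.

```rust
const FP_BITS: f32 = 16_384.0;

/// resonance = 1 - Hamming / 16384
pub fn resonance(distance: u32) -> f32 {
    1.0 - distance as f32 / FP_BITS
}

/// Count how many of the given pair distances a style threshold admits.
pub fn admitted_pairs(distances: &[u32], threshold: f32) -> usize {
    distances
        .iter()
        .filter(|&&d| resonance(d) >= threshold)
        .count()
}
```

With the three reported distances `[6592, 7232, 8256]`, `admitted_pairs` gives 0 at the analytical threshold (0.85) and 3 at both the creative (0.35) and peripheral (0.20) thresholds. Three pairs at two hits each matches the reported `hit_count: 6`.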

Threshold model

The pre-pass compares resonance against a per-style threshold rather than using an absolute bit count, because the 32x DeepNSM tiling pushes content-plane density from 0.016 to ~0.48 and the Jirak-calibrated 454-bit reference no longer applies directly. Per-style thresholds from UNIFIED_STYLES.resonance_threshold:

| Style | Threshold | Regime |
| --- | --- | --- |
| focused | 0.90 | strictest |
| analytical | 0.85 | strict |
| convergent | 0.75 | strict |
| deliberate | 0.70 | standard |
| systematic | 0.70 | standard |
| intuitive | 0.50 | loose |
| metacognitive | 0.50 | loose |
| diffuse | 0.45 | loose |
| divergent | 0.40 | loose |
| creative | 0.35 | loose |
| exploratory | 0.30 | loosest |
| peripheral | 0.20 | loosest |
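
To read the thresholds as bit budgets, the resonance formula can be inverted. The sketch below is illustrative only: `max_distance` is a hypothetical helper, and it works in `f64`, so results exactly at a boundary may differ from whatever precision the driver uses.

```rust
const FP_BITS: u32 = 16_384;

/// Largest Hamming distance whose resonance still meets `threshold`,
/// derived from resonance = 1 - Hamming / 16384.
pub fn max_distance(threshold: f64) -> u32 {
    ((1.0 - threshold) * FP_BITS as f64).floor() as u32
}
```

Under this reading, analytical (0.85) admits pairs that differ in at most 2457 bits, intuitive (0.50) at most 8192, and creative (0.35) at most 10649.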

Cross-ref: EPIPHANIES 2026-04-24 "Jirak noise floor calibrated for DeepNSM-tiled 16K-bit fingerprints" + "Ground truth: ShaderDriver dispatch wiring audit".
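
As a rough sanity check on why the 454-bit reference stops applying, consider two unrelated fingerprints of width N = 16384 whose bits are set independently with density p. Independence is an assumption of this estimate, and tiled fingerprints may well violate it. Each bit position then differs with probability 2p(1 - p):

```latex
\mathbb{E}[H] = 2p(1-p)\,N
\qquad
\begin{aligned}
p = 0.016 &: \quad 2(0.016)(0.984)(16384) \approx 516 \\
p = 0.48  &: \quad 2(0.48)(0.52)(16384) \approx 8179
\end{aligned}
```

Under that assumption, unrelated rows at density 0.48 are expected to sit near 8179 differing bits (resonance about 0.50), far above an absolute 454-bit cutoff.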

Test plan

  • cargo test -p cognitive-shader-driver — 45 pass (43 lib + 2 e2e, 3 new)
    • content_hamming_finds_similar_rows — two rows differ in 4 bits, analytical finds them at resonance > 0.5.
    • content_hamming_skips_dissimilar — two rows with Hamming ~10000 (resonance ~0.39), analytical correctly emits 0 content hits.
    • content_hamming_respects_style_threshold — Hamming ~5000 (resonance ~0.695), analytical 0 content hits, creative >= 1 content hits; monotonicity assertion.
  • Live verification: run shader-serve, encode 3 related rows, dispatch analytical/creative/peripheral; observe predicates=0x01 hits with correct pair ordering.
  • No regressions in pre-existing dispatch tests (meta prefilter, sink short-circuit, cycle fingerprint emission).
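
Fixtures like the ones the three new tests describe (rows that differ by a chosen number of bits) could be built along these lines. This is a sketch under assumptions, not the test code from this PR: `flip_first_bits` is a hypothetical helper, and the 256-word `u64` layout is assumed.

```rust
const FP_WORDS: usize = 256; // 16384 bits as u64 words (layout assumed)

/// Copy of `base` with its first `n` bits flipped, so that
/// Hamming(base, result) == n.
pub fn flip_first_bits(base: &[u64; FP_WORDS], n: usize) -> [u64; FP_WORDS] {
    assert!(n <= FP_WORDS * 64);
    let mut out = *base;
    for bit in 0..n {
        out[bit / 64] ^= 1u64 << (bit % 64);
    }
    out
}

/// Popcount of the XOR of two fingerprints.
pub fn hamming(a: &[u64; FP_WORDS], b: &[u64; FP_WORDS]) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// resonance = 1 - Hamming / 16384
pub fn resonance(distance: u32) -> f32 {
    1.0 - distance as f32 / (FP_WORDS as f32 * 64.0)
}
```

Flipping 5000 bits yields a resonance of about 0.695, which falls between the creative (0.35) and analytical (0.85) thresholds, the situation the style-threshold test relies on.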

Known follow-ups

  • Palette cascade hits still intermix via shared resonance sort — in production with meaningful planes, high-resonance palette hits can truncate content-match hits out of top_k (observed in the respects-style-threshold test when using demo_planes; resolved there by isolating with empty planes). If future tuning shows content drowning, consider a small content-hit resonance bonus or preserving the top-K content + top-K palette separately before the final merge.
  • The Jirak-derived absolute Hamming threshold (454 at 3σ) is density-dependent; document on the encode handler that the 32x tiling deviates from the calibration baseline. Not a blocker for this PR.
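
One way the "separate top-K" idea from the first follow-up might look is sketched below. Everything here is hypothetical: the `Hit` struct, the `merge_top_k` name, and the assumption that palette hits never set the `0x01` bit. The sketch returns up to 2k hits, and whether a final truncation should follow is left open.

```rust
const PRED_CONTENT_MATCH: u8 = 0x01;

/// Hypothetical stand-in for the driver's ShaderHit.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub row: u32,
    pub predicates: u8,
    pub resonance: f32,
}

/// Keep up to `k` content hits and up to `k` palette hits, each ranked by
/// resonance, then merge them so neither class can crowd out the other.
pub fn merge_top_k(hits: Vec<Hit>, k: usize) -> Vec<Hit> {
    // Split on the content-match flag.
    let (mut content, mut palette): (Vec<Hit>, Vec<Hit>) = hits
        .into_iter()
        .partition(|h| (h.predicates & PRED_CONTENT_MATCH) != 0);

    // Descending resonance; total_cmp gives a total order on f32.
    let by_resonance_desc = |a: &Hit, b: &Hit| b.resonance.total_cmp(&a.resonance);

    content.sort_by(by_resonance_desc);
    palette.sort_by(by_resonance_desc);
    content.truncate(k);
    palette.truncate(k);

    // Final merge, again ordered by resonance.
    content.extend(palette);
    content.sort_by(by_resonance_desc);
    content
}
```

With three palette hits at resonance 0.95, 0.90, and 0.88 and two content hits at 0.60 and 0.40, `merge_top_k(hits, 2)` keeps both content hits, whereas a single shared top-2 sort would drop them.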

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh


Generated by Claude Code

claude added 2 commits April 24, 2026 20:51
Adds a content-plane Hamming pre-pass to ShaderDriver::run() before the
existing palette cascade. For every pair of rows surviving the meta
prefilter, popcount the XOR of their content fingerprints; if
resonance = 1 - Hamming/16384 meets the style's resonance_threshold,
emit a ShaderHit with predicates=0x01 (content-match flag).

Why this matters: the palette cascade probes a pre-computed distance
table built from synthetic Base17 entries unrelated to actual content.
Before this change, dispatch() on encoded text returned 0 hits, 0
resonance, admit_ignorance=true regardless of semantic similarity.
The content plane was read only for cycle_fp XOR folding, never
compared.

Threshold model: resonance vs style.resonance_threshold (density-
agnostic). Analytical (0.85) is strict; creative (0.35) is loose;
focused (0.90) strictest; peripheral (0.20) loosest. See EPIPHANIES
2026-04-24 "Jirak noise floor calibrated for DeepNSM-tiled 16K-bit
fingerprints" for the Jirak-derived 3sigma reference point.

Guard: skip the N^2 sweep when passed_rows.len() > 256 (4096 rows ->
16M popcount x 256 comparisons = heavy).

Live verification (encode 3 rows, dispatch rows 0..3):
  "Palantir develops surveillance systems"        (row 0)
  "Palantir Gotham is a surveillance platform"    (row 1)
  "Israel deploys military AI"                    (row 2)

Analytical (0.85): 0 hits (all pairs resonance < 0.85) - correct.
Creative   (0.35): 6 hits, all predicates=0x01.
  rows 0<->1 (Palantir  / Palantir)    resonance 0.598  <- strongest
  rows 1<->2 (Palantir  / Israel)      resonance 0.559
  rows 0<->2 (Palantir  / Israel)      resonance 0.496  <- weakest
Confidence rose from 0 -> 0.598; admit_ignorance flipped to false.

Tests: 3 new (content_hamming_finds_similar_rows,
content_hamming_skips_dissimilar, content_hamming_respects_style_threshold).
All 45 tests pass (43 lib + 2 e2e).

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Records the hamming-content-cascade PR result: dispatch() now surfaces
content-match hits on semantically similar text rows.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7062258928


Comment on lines +124 to +126
        for (i, &row_i) in passed_rows.iter().enumerate() {
            let fp_i = self.bindspace.fingerprints.content_row(row_i as usize);
            for (j_off, &row_j) in passed_rows.iter().enumerate().skip(i + 1) {


P1: Bound content pre-pass by max_cycles

The new content-plane pre-pass iterates all passed_rows pairs (up to 256) before the existing max_cycles limit is applied in the main cascade loop, so a request with a small budget (for example max_cycles = 1) can still perform an O(n²) sweep and return content hits from rows that would never be processed as cycles. This violates the ShaderDispatch.max_cycles budget contract and can add large unexpected latency; the pre-pass should use the same bounded row subset as the cycle loop.
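
A possible shape for the suggested fix is sketched below. It assumes, without confirmation from the source, that the cycle loop processes the first `max_cycles` entries of `passed_rows`. The helper names `prepass_rows` and `pair_count` are hypothetical, and `guard` stands for the 256-row limit.

```rust
/// Rows the content pre-pass may touch: the same leading slice the cycle
/// loop is assumed to process. Returns an empty slice if that subset still
/// exceeds the pre-pass guard, mirroring the "skip the sweep" behaviour.
pub fn prepass_rows(passed_rows: &[u32], max_cycles: usize, guard: usize) -> &[u32] {
    let bound = passed_rows.len().min(max_cycles);
    let subset = &passed_rows[..bound];
    if subset.len() > guard {
        &passed_rows[..0]
    } else {
        subset
    }
}

/// Number of unordered pairs the sweep would visit for `n` rows.
pub fn pair_count(n: usize) -> usize {
    n * n.saturating_sub(1) / 2
}
```

With 100 passed rows and `max_cycles = 1`, the bounded subset has one row and the sweep visits zero pairs, instead of the 4950 pairs an unbounded pre-pass would compare.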

…t-cascade

# Conflicts:
#	.claude/board/AGENT_LOG.md
@AdaWorldAPI AdaWorldAPI merged commit b21c037 into main Apr 24, 2026
0 of 5 checks passed